LoopInvader: A Compiler for Tightly Coupled Processor Arrays

Authors

  • Alexandru Tanase
  • Michael Witterauf
  • Éricles Sousa
  • Vahid Lari
  • Frank Hannig
  • Jürgen Teich
Abstract

Continuous technology miniaturization makes it possible to build massively parallel embedded computer architectures within a single silicon chip. Programming that leverages the abundant parallelism in such architectures, however, is difficult, tedious, and error-prone. Thus, compiler support is paramount. We therefore present LoopInvader, a loop compiler for a particular class of massively parallel processor arrays: tightly coupled processor arrays (TCPAs) [2]. TCPAs consist of a two-dimensional array of VLIW processing elements (PEs) and several peripheral components that enable zero-overhead loops. In particular, a global controller (GC) generates synchronized control signals that govern the control flow of the PEs, removing control overhead from the loops; address generators (AGs) produce the addresses needed to feed the PEs with data from reconfigurable buffers, removing addressing overhead. Moreover, the PEs are connected to their neighbors via a circuit-switched interconnection network that is reconfigurable at runtime to accommodate the running application.
Figure 1 depicts an overview of our high-level programming methodology. We describe programs in a domain-specific functional language called PAULA that is based on dynamic piecewise linear/regular algorithms (DPLA) [3], a mathematical representation of loop programs. For the parallelization and mapping of such algorithms onto TCPAs, we use symbolic partitioning techniques [5] in the polyhedral model: instead of using fixed tile sizes, our symbolic partitioning technique keeps the size of the input data and the number of PEs symbolic until runtime. This gives applications more flexibility and is important in resource-aware computing paradigms such as invasive computing [4]. The alternatives are either time-consuming (e.g., dynamic recompilation) or costly (e.g., pre-compiling multiple variants) on embedded systems.
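The idea of keeping tile sizes symbolic until runtime can be illustrated with a minimal one-dimensional sketch. This is not the polyhedral technique of [5]; the function name `tile_bounds` and the even block distribution are assumptions made only for illustration. The point is that a single piece of code serves any problem size and any number of PEs, so nothing has to be recompiled or pre-compiled in several variants.

```python
def tile_bounds(n, num_pes):
    """Split iterations 0..n-1 into num_pes contiguous tiles.

    Hypothetical helper: both n (problem size) and num_pes (number of PEs)
    stay symbolic in the "compiled" code and are bound only at call time.
    Returns one half-open (start, end) range per PE.
    """
    if n < 0 or num_pes <= 0:
        raise ValueError("n must be >= 0 and num_pes must be > 0")
    tile = -(-n // num_pes)  # ceiling division: tile size fixed at runtime
    return [(min(pe * tile, n), min((pe + 1) * tile, n))
            for pe in range(num_pes)]


if __name__ == "__main__":
    # The same function handles different array sizes without recompilation.
    print(tile_bounds(10, 4))
    print(tile_bounds(8, 2))
```

If fewer iterations than PEs are available, the trailing PEs simply receive empty ranges in this sketch.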
After mapping, the compiler generates a configuration stream comprising assembly code for the PEs, interconnect configuration, address generator configuration, and global controller configuration. Because the PEs offer only small instruction memories, we developed an approach to generate code that is independent of the problem size [1]. This is achieved by finding processors, and program blocks within processors, that share the same code and combining that code into loops. As the PEs are interconnected by a circuit-switched interconnect, the compiler also generates all necessary configuration information. To preserve a given schedule of instructions, code for the GC is generated such that the repeated execution of each unique program block does not cause any extra cycles.
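As a rough sketch of why grouping processors that share code keeps the number of distinct programs small, the example below classifies the PEs of a rectangular array by which array borders they touch. The border-based criterion and the names `pe_class` and `group_pes` are assumptions for illustration only; the abstract states just that processors sharing the same code are identified, not how.

```python
from collections import defaultdict


def pe_class(row, col, rows, cols):
    """Hypothetical signature: assume a PE's program depends only on
    which borders of the array it lies on (top, bottom, left, right)."""
    return (row == 0, row == rows - 1, col == 0, col == cols - 1)


def group_pes(rows, cols):
    """Group PE coordinates by signature; each group would share one program."""
    groups = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            groups[pe_class(r, c, rows, cols)].append((r, c))
    return dict(groups)


if __name__ == "__main__":
    for size in (4, 8):
        groups = group_pes(size, size)
        print(size * size, "PEs share", len(groups), "distinct programs")
```

Under this assumption, the number of distinct programs stays constant as the array grows, which is the property that matters when instruction memories are small.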

Similar resources

Temperature modeling and emulation of an ASIC temperature monitor system for Tightly-Coupled Processor Arrays (TCPAs)

This contribution provides an approach for emulating the behaviour of an ASIC temperature monitoring system (TMon) during run-time for a tightly-coupled processor array (TCPA) of a heterogeneous invasive multi-tile architecture to be used for FPGA prototyping. It is based on a thermal RC modeling approach. Different usage scenarios of the TCPA are also analyzed and compared.

Improvement of Navigation Accuracy using Tightly Coupled Kalman Filter

In this paper, a mechanism is designed for integration of inertial navigation system information (INS) and global positioning system information (GPS). In this type of system a series of mathematical and filtering algorithms with Tightly Coupled techniques with several objectives such as application of integrated navigation algorithms, precise calculation of flying object position, speed and at...

PipeIt: A Pipeline Programming Framework For Embedded Processor Array Systems-on-Chip

This paper presents the PipeIt framework for developing pipelined applications targeted at tightly-coupled processor arrays on a chip. The framework includes a component programming and wiring model, a runtime environment, and a corresponding toolchain. It enables the programmer to develop applications in a high-level manner, structuring the code at the finest possible/meaningful level of granu...

Getting More From Your Multicore: Exploiting OpenMP From An Open Source Numerical Scripting Language

We introduce SLIRP, a module generator for the S-Lang numerical scripting language, with a focus on its vectorization capabilities. We demonstrate how both SLIRP and S-Lang were easily adapted to exploit the inherent parallelism of high-level mathematical languages with OpenMP, allowing general users to employ tightly-coupled multiprocessors in scriptable research calculations while requiring n...

Publication date: 2016